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Abstract —This paper considers the problem of robustly esti¬ 
mating a structured covariance matrix with an elliptical under¬ 
lying distribution with known mean. In applications where the 
covariance matrix naturally possesses a certain structure, taking 
the prior structure information into account in the estimation 
procedure is beneficial to improve the estimation accuracy. 
We propose incorporating the prior structure information into 
Tyler’s M-estlmator and formulate the problem as minimizing 
the cost function of Tyler’s estimator under the prior structural 
constraint. First, the estimation under a general convex structural 
constraint is introduced with an efficient algorithm for finding 
the estimator derived based on the majorization minimization 
(MM) algorithm framework. Then, the algorithm is tailored to 
several special structures that enjoy a wide range of applications 
in signal processing related fields, namely, sum of rank-one 
matrices, Toeplitz, and banded Toeplitz structure. In addition, 
two types of non-convex structures, i.e., the Kronecker structure 
and the spiked covariance structure, are also discussed, where 
it is shown that simple algorithms can be derived under the 
guidelines of MM. Numerical results show that the proposed 
estimator achieves a smaller estimation error than the benchmark 
estimators at a lower computational cost. 

Index Terms —Robust estimation, Tyler’s M-estlmator, struc¬ 
tural constraint, majorization minimization. 


I. Introduction 

Estimating the covariance matrix is a ubiquitous problem 
that arises in various fields such as signal processing, wireless 
communication, bioinformatics, and financial engineering m, 
0, a. It has been noticed that the covariance matrix in 
some applications naturally possesses some special structures. 
Exploiting the structure information in the estimation process 
usually implies a reduction in the number of parameters to be 
estimated, and thus is beneficial to improving the estimation 
accuracy 0. Various types of structures have been studied. 
Eor example, the Toeplitz structure with applications in time 
series analysis and array signal processing was considered in 
0 , 0 , 0 . A sparse graphical model was studied in 0 , 
where sparsity was imposed on the inverse of the covariance 
matrix. Banding or tapering the sample covariance matrix 
was proposed in 0 . A spiked covariance structure, which 
is closely related to the problem of component analysis and 
subspace estimation, was introduced in 0. Other structures 
such as group symmetry and the Kronecker structure were 
considered in Qoi, ini, ina. 
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While the previously mentioned works have shown that 
enforcing a prior structure on the covariance estimator im¬ 
proves its performance in many applications, most of them 
either assume that the samples follow a Gaussian distribution 
or attempt to regularize the sample covariance matrix. It has 
been realized that the sample covariance matrix, which turns 
out to be the maximum likelihood estimator of the covariance 
matrix when the samples are assumed to be independent 
identically normally distributed, performs poorly in many real- 
world applications. A major factor that causes the problem is 
that the distribution of a real-world data set is often heavy¬ 
tailed or contains outliers. In this case, a single erroneous 
observation can lead to a completely unreliable estimate ifTSll . 

A way to address the aforementioned problem is to find a 
robust structured covariance matrix estimator that performs 
well even if the underlying distribution deviates from the 
Gaussian assumption. One approach is to refer to the minimax 
principle and seek the “best” estimate of the covariance for 
the worst case noise. To be precise, the underlying probability 
distribution of the samples / (•) is assumed to belong to an 
uncertainty set of functions F that contains the Gaussian 
distribution, and the desired minimax robust estimator is the 
one whose maximum asymptotic variance over the set F is 
less than that of any other estimator. Two types of uncertainty 
sets F, namely the £-contamination and the Kolmogorov class, 
were considered in m, where a structured maximum likeli¬ 
hood type estimate (M-estimate) was derived as the solution of 
a constrained optimization problem. The uncertainty set that 
we are interested in is the family of elliptically symmetric 
distributions. It was proved by Tyler in na that given K- 
dimensional independent and identically distributed (i.i.d.) 
samples jy drawn from an elliptical distribution, 

the Tyler’s estimator defined as the solution to the fixed-point 
equation 

K ^ V 
R = — 

7V^xfR-ix,’ 

is a minimax robust estimator. Additionally, it is “distribution- 
free” in the sense that its asymptotic variance does not depend 
on the parametric form of the underlying distribution. 

The problem of obtaining a structured Tyler’s estimator was 
investigated in the recent works ifTbll and iflTl . In particular, the 
authors of ifTbll focused on the group symmetry structure and 
proved that it is geodesically convex. As the Tyler’s estimator 
can be defined alternatively as the minimizer of a cost function 
that is also geodesically convex, it is concluded that any local 
minimum of the cost function on a group symmetry constraint 
set is a global minimum. A numerical algorithm was also 
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proposed to solve the constrained minimization problem. In 
im, a convex structural constraint set was studied and a 
generalized method of moments type covariance estimator, 
COCA, was proposed. A numerical algorithm was also pro¬ 
vided based on semidefinite relaxation. It was proved that 
COCA is an asymptotically consistent estimator. However, the 
algorithm suffers from the drawback that the computational 
cost increases as either N or K grows. 

In this paper, we formulate the structured covariance esti¬ 
mation problem as the minimization of Tyler’s cost function 
under the structural constraint. Our work generalizes m 
by considering a much larger family of structures, which 
includes the group symmetry structure. Instead of attempting 
to obtain a global optimal solution, which is a challenging 
task due to the non-convexity of the objective function, we 
focus on devising algorithms that converge to a stationary 
point of the problem. We first work out an algorithm frame¬ 
work for the general convex structural constraint based on 
the majorization minimization (MM) framework, where a 
sequence of convex programming is required to be solved. 
Then we consider several special cases that appear frequently 
in practical applications. By exploiting specific problem struc¬ 
tures, the algorithm is particularized, significantly reducing 
the computational load. We further discuss in the end two 
types of widely studied non-convex structures that turn out 
to be computationally tractable under the MM framework; 
one of them being the Kronecker structure and the other one 
being the spiked covariance structure. Although theoretically 
the algorithms can only be proved to converge to a stationary 
point, for all the specific structures that are considered in this 
paper, it is observed that the proposed algorithms converge to a 
unique point (in R) with random initialization in the numerical 
simulations. 

The paper is organized as follows. In Section II, we intro¬ 
duce the robust covariance estimation problem with structural 
constraint and derive a majorization minimization based al¬ 
gorithm framework for the general convex structure. Several 
special cases are considered in Section III, where the algorithm 
is particularized obtaining higher efficiency by considering 
the specific form of the structure. Section IV discusses the 
Kronecker structure and the spiked covariance structure, which 
are non-convex but algorithmically tractable. Numerical results 
are presented in Section V and we conclude in Section VI. 


II. Tyler’S Estimator with Structural Constraint 

Consider a number of N samples {xi,...,X 7 v} in 
drawn independently from an elliptical underlying distribution 
with density function as follows: 

/(x) = det (Ro)“^ g (x'^Rq ^x) , (1) 

where Rq G is the scatter parameter that is proportional 
to the covariance matrix if it exists, and g (•) characterizes the 
shape of the distribution. Tyler’s estimator for Rq is defined 
as the solution to the following fixed-point equation: 


K X 

R = _ 

W^xfR-ix,’ 
1—1 ^ 


( 2 ) 


which can be interpreted as a weighted sum of rank one 
matrices x^x^ with the weight decreasing as x^ gets farther 
from the center. It is known that if x is elliptically distributed, 

then the normalized random variable s = -ir^ follows 

I|x|l2 

an angular central Gaussian distribution with the probability 
density function (pdf) taking the form 

/ (s) oc det (Ro)~^ (s^Rjj'^s) ^ . (3) 

Tyler’s estimator coincides with the maximum likelihood es¬ 
timator (MLE) of Ro by fitting the normalized samples {s^} 
to / (s). In other words, the estimator R is the minimizer of 
the following cost function 

K ^ 

L (R) = log det (R-) + ^ X! (xf R~^x*) (4) 

i^l 

on the positive definite cone S+_|_. The estimator is proved 
to be consistent and asymptotically normal with the variance 
independent of g{ ). Eurthermore, it is a minimax robust 
covariance estimator with the underlying distribution being 
elliptically symmetric IfTSlI . 

It has been noticed that in some applications, the covariance 
matrix possesses a certain structure and taking account this 
information into the estimation yields a better estimate of Rq 
M, nni, m, na. Motivated by this idea, we focus on 
the problem of including a prior structure information into 
the Tyler’s estimator to improve its estimation accuracy. To 
formulate the problem, we assume that Rq is constrained 
in a non-empty set S that is the intersection of a closed 
set, which characterizes the covariance structure, and the 
positive semidefinite cone and then proceed to solve the 
optimization problem: 

minimize log det (R) + -^ X] R-'^x*) 

i=i 

subject to R G iS. 

The minimizer R of the above problem is the one in the struc¬ 
tural set S that maximizes the likelihood of the normalized 
samples {s^}. 

Throughout the paper, we make the following assumption. 
Assumption 1: The cost function i(Rf) —>■ -I-cxd when the 
sequence {Rf} tends to a singular limit point of the constraint 
set S. 

Under this assumption, the case that R is singular can be 
excluded in the analysis of the algorithms hereafter. 

Note that the assumption 

Assumption 2: / (x) is a continuous probability distribution, 
and iV > a: , 

implies L (Rt) —?■ -l-c» whenever R^ tends to the boundary 
of the positive semidefinite cone S;^ with probability one lfT9l 
. It is therefore also a sufficient condition for the assumption 
to be held as 5 C S;^. 

Problem Q is difficult to solve for two reasons. Eirst, the 
constraint set S is too general to tackle. Second, even if S 
possesses a nice property such as convexity, the objective 
function is still non-convex. Instead of trying to find the global 
minimizer, which appears to be too ambitious for the reasons 
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pointed out above, we aim at devising efficient algorithms that 
are capable of finding a stationary point of Q. We rely on 
the MM framework to derive the algorithms, which is briefly 
stated next for completeness. 


A. The Majorization Minimization Algorithm 
For a general optimization problem 
minimize h (x) 

X 

subject to X € A”, 


( 6 ) 


where X is a closed convex set, the MM algorithm finds a 
stationary point of (|6]l by successively solving a sequence of 
simpler optimization problems. The iterative algorithm starts 
at some arbitrary feasible initial point xq, and at the {t + l)-th 
iteration the update of x is given by 


Xt+i = argminp (xlxj), 

with the surrogate function p(x|xt) satisfying the following 
assumptions: 


^(Xi) = p(xt|xt), Vxj G A” 

/i(x) < p(x|x 4 ), Vx,xt G-T (7) 

/i'(xt;d) = p'(xt;d|xt) , Vxt + d G A", 

where h' (x; d) stands for the directional derivative of h (•) at 
X along the direction d, and p (x|x() is continuous in both x 
and Xj. 

It is proved in that any limit point of the sequence 
{xj} generated by the MM algorithm is a stationary point of 
problem (|6]l. If it is further assumed that the initial level set 
{x|/i(x) < h (xq)} is compact, then a stronger statement, as 
follows, can be made: 


lim d {xt,X*) = 0, 

t —^“|“00 


where X* stands for the set of all stationary points of (|6]l. 

The idea of majorizing h (x) by a surrogate function can 
also be applied blockwise. Specifically, x is partitioned into m 
blocks as X = (x^^\ ..., where each -dimensional 

block x(*) G a; and A" = 0™ i 

At the (t -f l)-th iteration, x^®^ is updated by solving the 
following problem: 


minimize 

x(*) 

subject to 


x(®) G A”, 


( 8 ) 


with i = it mod m) + l and the continuous surrogate function 
gi (x(®^|xj) satisfying the following properties: 


h{xt) 



x«,...,x«) 


G A-,, 

h' (xt;d°) 




Vxj®^ -1- di G a;, 



d° = (0;...;dG. 

..;0) 


In short, at each iteration, the block MM applies the ordinary 
MM algorithm to one block while keeping the value of the 
other blocks fixed. The blocks are updated in cyclic order. 

In the rest of this paper, we are going to derive the specific 
form of the surrogate function g (R|Rt) based on a detailed 
characterization of various kinds of S. In addition, we are 
going to show how the algorithm can be particularized at a 
lower computational cost with a finer structure of S available. 
Before moving to the algorithmic part, we first compare our 
formulation with several related works in the literature. 

B. Related Works 

In m, the authors derived a minimax robust covariance 
estimator assuming that / (x) is a corrupted Gaussian distri¬ 
bution with noise that belongs to the e-contamination class and 
the Kolmogorov class. The estimator is defined as the solution 
of a constrained optimization problem similar to (|5]l, but with 
a different cost function. Apart from the distinction that the 
family of distributions we consider is the set of elliptical 
distributions, the focus of our work, which completely differs 
from M, is on developing efficient numerical algorithms for 
different types of structural constraint set S. 

Two other closely related works are M and ifTTl . In EH, 
the authors have investigated a special case of Q, where 
S is the set of all positive semideflnite matrices with group 
symmetry structure. It has been shown that both L (R) and 
the group symmetry constraint are geodesically convex, there¬ 
fore any local minimizer of (|5]i is global. Several examples, 
including the circulant and persymmetry structure, have been 
proven to be a special case of the group symmetry constraint. 
A numerical algorithm has also been provided that decreases 
the cost function monotonically. Our work includes the group 
symmetry structure as a special case since the constraint 
is linear, and provides an alternative algorithm to solve the 
problem. 

In Ell, the authors have considered imposing convex con¬ 
straint on Tyler’s estimator. A generalized method of moment 
type estimator based on semideflnite relaxation defined as the 
solution of the following problem: 

1 ^ 

i—1 

1 ^ ( 9 ) 

subject to R ^ —diXjX^ , \/i = 1,..., N, 
di > 0, yi = 1,... ,N, 

was proposed and proved to be asymptotically consistent. 
Nevertheless, the number of constraints grows linearly in 
N and as it was pointed out in the paper, the algorithm 
becomes computationally demanding either when the problem 
dimension K or the number of samples N is large. On 
the contrary, our algorithm based on formulation Q is less 
affected by the number of samples N and is therefore more 
computationally tractable. 

III. Tyler’s Estimator with Convex structural 

CONSTRAINT 

In this section, we are going to derive a general algorithm 
for problem (|5]l with S being a closed convex subset of S;^, 


minimize 

Re5.di 
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which enjoys a wide range of applications. For instance, the 
Toeplitz structure can be imposed on the covariance matrix 
of the received signal in direction-of-arrival estimation (DOA) 
problems. Banding is also considered as a way of regularizing 
a covariance matrix whose entries decay fast as they get far 
away from the main diagonal. 

Since S is closed and convex, constructing a convex sur¬ 
rogate function g (R|Rt) for L (R) turns out to be a natural 
idea since then Rt+i can be found via 

Rt+i = argmin g (R|Rt), (10) 

which is a convex programming. 

Proposition 1. At any Rt 0, the objective function L (R) 
can be upperbounded by the convex surrogate function 

g (R|R0 = Tr (R^'R) + f ^ 

with equality achieved at R*. 

Proof: Since logdet(-) is concave, logdet(R) can be 
upperbounded by its first order Taylor expansion at Rt: 


Algorithm 1 Robust covariance estimation under convex 
structure 


1: Set f = 0, initialize Rt to be any positive definite matrix. 

2: repeat 


3: 

Compute Mt-% 


4: 

Update Rt+i as 



Rt+i = arg min Tr (Rj"^R)-(-Tr 

1(15) 


Rt+i = Rt+i/Tr ^Rt+i^ . 

(16) 

5: 

t i — t -|- 1. 



6: until Some convergence criterion is met 


equivalent to solving 

K 

minimize log det (R) + — ^ log (xf R~^x,;) 
subject to Tr (R) = 1. 

The conclusion follows by a similar argument to Proposition 
17 in lEI]. ■ 


log det (R) < log det (Rt) + Tr (Rj ^R) — 7T (12) 


with equality achieved at Rt. 

Also, by the concavity of the log (•) function we have 


log (x) < log a H-1, Va > 0, 

a 


(13) 


which leads to the bound 


log(xfR ^x,) < 




log (xfRt ^x,) - 1 


with equality achieved at Rt. ■ 

The variable R then can be updated as (fTOl i with surrogate 
function (fTTI) . 

By the convergence result of the MM algorithm, it can 
be concluded that every limit point of the sequence {Rt} is 
a stationary point of problem (|5]). Note that for all of the 
structural constraints that we are going to consider in this 
work, the set S possesses the property that 


R € 5 iff rR € 5, Vr > 0. (14) 


Since the cost function L (R) is scale-invariant in the sense 
that L (R) = L (rR), we can add a trace normalization 
step after the update of Rt without affecting the value of 
the objective function. The algorithm for a general convex 
structural set is summarized in Algorithm [T] 

Proposition 2. If the set S satisfies ( 1741) . then the sequence 
generated by Algorithm Q] satisfies 

lim d(Rt,5*) = 0, (17) 

t—foo 

where S* is the set of stationary points of problem dJ]). 

Proof: Since the objective function L (R) is scale- 
invariant, and the constraint set satisfies (O, solving © is 


A. General Linear Structure 

In this subsection we further assume that the set S is the 
intersection of and an affine set A. The following lemma 
shows that in this case, the update of R (eqn. (fTSl l) can be 
recast as an SDR 


Lemma 3. Problem ( 115b is equivalent to 


minimize 

s,rg5 


subject to 


Tr{Rf^R)+Tr{MtS) 


S I 
I R 


(18) 


^ 0 , 


in the sense that if (S*,R*) solves ( lidb , then R* solves ( 115b . 


Proof: Problem (fTSl l can be written equivalently as 

minimize Tr (Rj”^R) + Tr (M^S) 

S ,R.G<S 

subject to S = R~^. 


Now we relax the constraint S = R^ ^ as S ^ R By the 
Schur complement lemma for a positive semidefinite matrix, 
if R 0, then S ^ R^^ is equivalent to 


Therefore (fTSl l is a convex relaxation of (flST l. 

The relaxation is tight since Tr (MtS) > Tr (MtR“^) if 
Mt > 0 and S ^ R“^. ■ 

Lemma [3 reveals that for linear structural constraint. Algo¬ 
rithm [T] can be particularized as solving a sequence of SDPs. 
An application is the case that R can be parametrized as 


L 


R — o,jRj 
7=1 


(19) 


with Oj G K being the variable and Bj G being the 

corresponding given basis matrix, and R is constrained to be 
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in Using expression ( fT9l l, the minimization problem (fTST l 
can be simplified as 


minimize 

s.Kl 


subject to 


L 

^a,Tr(R-iB,) 

j=i 


S I 


+ Tr(MtS) 

h 0. 


( 20 ) 


IV. Tyler’s Estimator with Special Convex 
Structures 

Having introduced the general algorithm framework for a 
convex structure in the previous section, we are going to 
discuss in detail some convex structures that arise frequently 
in signal processing related fields, and show that by exploiting 
the problem structure the algorithm can be particularized with 
a significant reduction in the computational load. 


A. Sum of Rank-One Matrices Structure 

The structure set S that we study in this part is 


5 = 


R|R = 



( 21 ) 


where the a^’s are known vectors in C^. The matrix R can 
be interpreted as a weighted sum of given matrices 

As an example application where structure (EB appears, 
consider the following signal model 


X = A/3 + £, (22) 

where A = [ai,..., a/,]. Assuming that the signal (3 and noise 
e are zero-mean random variables and any two elements of 
them are uncorrelated, then the covariance matrix of x takes 
the form 

L 

Cov (x) = (23) 

i=i 

where pj = Var(/3j) is the signal variance and S = 
diag (tTi,..., (Tk) is the noise covariance matrix. 

Define p = [pi,.. -Pl]^ and P = diag (p), then R can be 
written compactly as R = APA^ + S. Further define 

P =diag(pi,...,PL,cri,...,crif) 

A = [A,I] 

then R = APA^. Therefore, without loss of generality, we 
can focus on the expression R = APA^, assuming that every 
K columns of A are linearly independent and L > K. 

Note that in example (l22l i. R is complex-valued and prob¬ 
lem 0, which is formulated based on the real-valued elliptical 
distribution / (x), needs to be modified to 

K ^ 

logdet(R) + -^log(xfR ^x,) 

i—1 ^ ^ 

subject to R = APA^, 

where the x^’s are assumed follow a complex-valued elliptical 
distribution instead. 


Since R is linear in the pj’s. Algorithm [T] can be applied. 
In the following, we are going to provide a more efficient 
algorithm by substituting R = APA^ into the objective 
function L (R) and applying the MM procedure with P being 
the variable. 


Proposition 4. Af any Pj >- 0, the objective function 


L (P) = log det (APA^) + ^ ^ log (xf (APA^) x,) 

(26) 

can be upperbounded by the surrogate function 

5 (P|Pt) = wfp + -I-const. (27) 

with equality achieved at 'P — P^, where p“^ stands for the 
element-wise inverse of p, and 


Rt = APtA^ 


Mt 

Wt 

dt 


A^xfRr^x, 
diag (A^Rf^A) 
diag (PtA^Rf^MtRf^APt) ■ 


(28) 


Proof: First, observe that inequalities (fTB and (fTsT l imply 

that 


L (P) < wf p -b Tr (MtR-i) -f const. (29) 

with equality achieved at P = P^. 

Assume that P 0, from the identity 

„ r Rf^APtP-^PtA^Rf^ I 
[ I APA^ 

Rt"^APtP-i/2 
APi/2 

we know that S ^ 0. By the Schur complement, S ^ 0 is 
equivalent to 

Rf^APtP-^PtA^Rf^ h {APA^y^. (30) 

Since Mt ^ 0, we have 

Tr (MtR-i) < Tr (MtRt^^APtP-^Pt A^R^-^) (31) 

with equality achieved at P = P^. 

Since R 0, the left hand side of OTl i is finite. Therefore 
(I 3 TI) is also valid for P ^ 0. Substituting (|3T1) into (|2^ yields 
the surrogate function (l27l) . ■ 

The update of P then can be found in closed-form as 


[ p-i/2pjA^Rt-i pi/2A^ ] , 


iPj)t+i = \Jidj)J iwj)^. (32) 

The algorithm is summarized in Algorithmic 

Compared to Algorithm [T] in which the minimization prob¬ 
lem (fTOl) has no closed-form solution and typically requires an 
iterative algorithm, the new algorithm only requires a single 
loop iteration in p and is expected to converge faster. 
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Algorithm 2 Robust covariance estimation under sum of rank- 
one matrices structure_ 

1: Set f = 0, initialize pt to be any positive vector. 

2: repeat 

3: Rt = APiA^, Rt = Rt/T^Rt) . 

4: Compute Mt, Wt, dt with (|28| 

5: iPj)t+l = 

6: t i — t -f 1. 

7: until some convergence criterion is met 


B. Toeplitz Structure 

Consider the constraint set being the class of real-valued 
positive semidefinite Toeplitz matrices Tk- If R € Tk, then it 
can be completely determined by its first rowQ [ro,..., rx-i]. 

In this subsection, we are going to show that based on the 
technique of circulant embedding, Algorithm |2] can be adopted 
to solve the Toeplitz structure constrained problem at a lower 
cost than applying the sequential SDP algorithm (Algorithm 

ffl). 

The idea of embedding a Toeplitz matrix as the upper-left 
part of a larger circulant matrix has been discussed in a, a, 
1221. It was proved in 0 that any positive definite Toeplitz 
matrix R of size KxK can be embedded in a positive definite 
circulant matrix C of larger size L x L parametrized by its 
first row of the form 

[ro, r-i,..., rx-i,*, ■■■,*, tk-i, • ■ •, t’i] , 

where * denotes some real number. R then can be written as 

R = [ Ik 0 ]c[ Ik 0 ]^. (33) 

Clearly, for any fixed L, if C is positive semidefinite, so is 
R. However, the statement is false the other way around. In 
other words, the set 

Tk = {r\R=[Ik 0]C[Ik 0]^,CeC'i}, 

(34) 

where Cl denotes the set of real-valued positive semidefinite 
circulant matrices of size L x L, m ^ subset of Tk- 

Instead of Tk, we restrict the feasible set to be Tk with 
L > 2K — 1. Since a symmetric circulant matrix can be 
diagonalized by the Fourier matrix, if R G Tk then it can 
be written as 

R = Adiag(po,---,PL-i)A^, (35) 

where 

A = [ Ik 0]Fl, (36) 

with F L being the normalized Fourier transform matrix of size 
L X L and pj = PL-j, Vj = 1,..., L — 1. 

* Following the convention, the indices for the Toeplitz structure start from 


The robust covariance estimation problem over the restricted 
set of Toeplitz matrices T^ then takes the form 

K , 

minimize log det (R) + — > log (xf^R“^Xi) 

R,PbO TV ^ ^ ^ 

(37) 

subject to R = APA^ 

Pj =PL-j, Vj = 1,...,T- 1, 

which is the same as (ES except that the last equality 
constraint on the p/s. 

By Proposition m the inner minimization problem takes the 
form 

minimize wf p -f dj 

(38) 

subject to Pj = PL-j, Vj = 1,..., L — 1. 

Note that by the property of the Fourier transform matrix, we 
have Sij = ^L-j, Vj = 1,... ,L — 1, where the upper bar 
stands for element-wise complex conjugate. As a result, for 
j = l,...,L-l, 

iWj)t = 

which implies that the constraint pj = PL-j will be satisfied 
automatically. 

The algorithm for the Toeplitz structure based on circulant 
embedding is summarized in Algorithm |3] Notice that Algo¬ 
rithm [3] can be generalized easily to noisy observations by the 
augmented representation (l24li. 


Algorithm 3 Robust covariance estimation under the Toeplitz 
structure (Circulant Embedding) 

1: Set L to be an integer such that L > 2K — 1. 

2 : Construct matrix A = [ Ik 0 ] Fl 
3: Call Algorithmic] 


C. Banded Toeplitz Structure 

In addition to imposing the Toeplitz structure on the co- 
variance matrix, in some applications we can further require 
that the Toeplitz matrix is fc-banded, i.e., rj = 0 if j > k. For 
example, the covariance matrix of a stationary moving average 
process of order k satisfies the above assumption. One may 
also consider banding the covariance matrix if it is known 
in prior that the correlation of Xt and xt-r decreases as t 
increases. 

Based on the circulant embedding technique introduced in 
the last subsection, the problem can be formulated as 

K , 

minimize log det (R) -b — log (xf^R“^Xi) 

R.P>-0 N ^ ^ ' 

- i=\ 

subject to R = APA^ (40) 

Pd =PL-j, Vj = 1 

Tj = 0, Vj = fc -b 1,..., A — 1. 


0 . 
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By Proposition |4] the inner minimization problem becomes 
minimize wj p + d[ p~ ^ 

p>0 

subject to pj = PL-j, Vj = 1,..., L — 1 
r, =0, Vj = k + 1,...,K-1, 
which can be rewritten compactly as 
minimize wf p + dj p“^ 

p>0 

subject to [ Oi^K-k-i)xk+i Ix-fc-i ] Ap = 0 (42) 

Pj =PL-j, Vj = 1,... - 1. 

Define real-valued quantities 


A = Re 


w = 

d = 


I v^ao,ai,... ,a|-_^-| j I 

V2wo, wi ,..., wvL-i 


do/V2, di 


we have the equivalent problem 


minimize wi p 

p>0 




I] dj/Pj 
j=o 


(43) 

(44) 

(45) 

(46) 


subject to Ap = 0, 

where the variables p and p are related by 

p= Po/V2,pi,...,p^^-^ . (47) 

Compared to (l42li . the equivalent problem has a lower compu¬ 
tational cost as both the number of variables and constraints 
are reduced. The algorithm for the banded Toeplitz structure 
is summarized in Algorithmic] 


Algorithm 4 Robust covariance estimation under the Banded 
Toeplitz structure (Circulant Embedding) 

1: Set L to be an integer such that L > 2K — 1. 

2: Construct matrix A = [ li^ 0 ] and A with (|4^ . 

3: Set t = 0, initialize pt to be any positive vector. 

4: repeat 

5: Rf = APtA^, Rt = Rt/T^Rt) . 

6: Compute Mt, wt, dt with (|2^ . 

7: Compute w and d with (l44li and (05]), and update p 

as the minimizer of (06]). 

8: Compute p with (07]), Pt ^ P 

9: t i — f 1 

10: until some convergence criterion is met 


D. Convergence Analysis 

We consider Algorithm |2] and the argument for Algorithms 
[3] and |4] would be similar. 

As Proposition H] requires Pt 0, we consider the follow¬ 

ing e-approximation of problem (i25b : 

minimize log det (R + eAA^) 

R,p>0 ^ ^ 

+ § E log + eAA^) X,) (48) 

subject to R = APA^ 


with e > 0, where the upperbound derived in Proposition |4] 
can be applied for P = P -f el. Algorithm |2] can be easily 
modified to solve problem (08]), and under Assumption 1, the 
limit point of the sequence {Pt} generated by Algorithm |2] 
converges to the set of stationary points of (1481) . 

That is, if (p*^)* is a limit point of {pj}, then 

VL'^((p'^)*)^d>0 (49) 

for any feasible direction d, where ^(p'^)*) is the gradient 
of the objective function (p) at (p*^) . 

Proposition 5. Under Assumption 1, let be a positive 

sequence with lim = 0, then any limit point p* of the 
k —^“1”00 

sequence {(p'^'')*} is a stationary point of problem ( 1251) . 

Proof: The conclusion follows from the continuity of 
((p*^)*) in (p*^)* and e under Assumption 2. ■ 

In practice, as e can be chosen as an arbitrarily small num¬ 
ber, directly applying Algorithms |2] |3 and 0] or adapting them 
to solving the e-approximation problem would be virtually the 
same. 

V. Tyler’S Estimator with Non-Convex Structure 

In the previous sections we have proposed algorithms for 
Tyler’s estimator with a general convex structural constraint 
and discussed in detail some special cases. Eor the non-convex 
structure, the problem is more difficult to handle. In this 
section, we are going to introduce two popular non-convex 
structures that are tractable by applying the MM algorithm, 
namely the spiked covariance structure and the Kronecker 
structure. 

A. The Spiked Covariance Structure 

The term “spiked covariance” was introduced in ll^ and 
refers to the covariance matrix model 

L 

R = '^Pja.jaJ + a^I, (50) 

where L is some integer that is less than K, and the a^’s 
are unknown orthonormal basis vectors. Note that although 
(l50l ) and (l23l ) share similar form, they differ from each 
other essentially since the aj’s in (l23l ) are known and are 
not necessarily orthogonal. The model is directly related to 
principle component analysis, subspace estimation, and also 
plays an important role in sensor array applications a, M- 
This model, referred to as factor model, is also very popular 
in financial time series analysis ll24l . 

The constrained optimization problem is formulated as 

minimize log det (R) -b — > log (xf R'^Xi) 

R,aj.p>0.o- ° \ ^ ! 

A ^ „ (51) 

subject to R = y^pj^ja/ -b cr I, 

7=1 

AA^ = I, 

where A = [ai, ..., a^j. 












Applying the upperbound ( fT3l l for the second term in the 
objective function yields the following inner minimization 
problem: 


minimize 

Fl,aj ,p>0,cr 

subject to 


N 


logdet(R) + -^-^ 




R = ^PjajaJ 
i=i 

AA^ = I. 


crl, 


(52) 


Although the problem is non-convex, a global minimizer can 
be found in closed-form as 


B. The Kronecker Structure 

In this subsection we consider the covariance matrix that 
can be expressed as the Kronecker product of two matrices, 
i.e., 

R = A0B, (54) 

where A G and B G S^. 

Substituting R = A (g) B into the objective function yields 
the equivalent problem: 

N 

minimize — logTr (A~^Mf B~^Mi) 

Ago.Bgo N ^ ' ' (55) 

i—1 

+ q log det (A) + p log det (B) 


1 


K 


j=L+l 


P 


= A,-(a* 


(53) 


where Ai > • • • > Xk are the sorted eigenvalues of matrix 
K ^ X yiF 

— ^ A — and the u,’s are the associated eigenvectors 

A^xfRj Xi 

1251 . The algorithm for the spiked covariance structure is 
summarized in Algorithm |5] 


Algorithm 5 Robust covariance estimation under the spiked 
covariance structure 


Initialize Rq to be an arbitrary feasible positive definite 
matrix. 

repeat 


Mt = #Ef=i 


-R.7 


Eigendecompose Mj as Mt = XjUjuJ, where 
Ai > • • • > Xk- 


5: Compute (7*, p^, a* with (l5^ 

6: R,+i = + '^*1- 

7 : R-t-l-1 = Rt+l/Tr ^Rt+1^ ■ 

8 : t ^— t -f 1. 

9: until Some convergence criterion is met. 


As the feasible set is not convex, the convergence statement 
of the MM algorithm in ll20l needs to be modified as follows. 

Proposition 6. Any limit point R* generated by the algorithm 
satisfies 

Tr (VL (R*)^ r) > 0, VR G 7^ (R*), 

where Tg (R-*) stands for the tangent cone of S at R*. 

Proof: The result follows by combining the standard 
convergence proof of the MM algorithm ll20l and the necessity 
condition of R* being the global minimal of g (R|R*) over 
an arbitrary set S (see Proposition 4.7.1 in 1261): 

Tr (Vg (R*|R*)^ r) > 0, VR G 7^ (R*). 


where G M'jxp ]y[^ _ yg^ {'s.i). Denote the objective 
function of (fSSl) as L (A, B). 

Note that although the objective function of the equivalent 
problem is still non-convex, the constraint set of the equivalent 
problem (l55]) becomes the Cartesian product of two convex 
sets, which is convex. 

1) Gauss-Seidel: Since L (R) is scale-invariant, we can 
make the restriction that Tr (A) = 1 and Tr (B) = 1 and then 
problem (fSSl) can be solved by updating A and B alternatively. 

Specifically, for fixed B = Bt, we need to solve the 
following problem: 

N 

mimmze log det (A) -|- ^ logTr (A“^Mf 

' i=l 

subject to Tr(A) = l. 

(56) 

Setting the gradient of the objective function to zero yields 
the fixed-point equation 


A = 


N 

_ 

A Tr (A- 

1—1 \ 


Mj’Br^M,- 




(57) 


As objective function of (156b is essentially the same as the 
Tyler’s cost function (HI, an argument similar to Theorem 2.1 
in IBl reveals that the solution to dSll) is unique up to a 
positive scaling factor, and under Assumption 1, the iteration 


A 


N 



i=l 


Tr (AT^Mf Bj-iM,) 



(58) 


converges to the unique global minimum of (l56b as r —-boo. 
Assign Ai+i = lim A^, similarly we have the fixed-point 

r—¥-\-oo 

iteration for B as 


^ q^ M,AyMf 

^ttTr(A,y,MfB,M,) (59) 
B,+i = B/Tr (b) , 
and Bt+i = lim B*". 

r—>-+CO 

Proposition 7. Under Assumption 1, every limit point of the 
sequence {(Aj, Bt)} generated by Algorithm^is a stationary 
point 0/(1551). 


Proof: Application of Proposition 2.7.1 in lIZTl . ■ 
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Algorithm 6 Robust covariance estimation under the Kro- 
necker structure (Gauss-Seidel) 

1 : Initialize Ao and Bq to be arbitrary positive definite 
matrices of size p x p and q x q, respectively. 

2: repeat 

3: Update A with dSSl l. 

4: Update B with (l59T l. 

5: t i — t -f 1. 

6 : until Some convergence criterion is met. 


2 ) Block Majorization Minimization: A stationary point of 
L(A,B) can also be found by block majorization minimiza¬ 
tion algorithm (Block MM). 

By Proposition [T] with the value of Bj fixed to be B(, a 
convex upperbound of L (A, B) on at point At (ignoring 
a constant term and up to a scale factor of q) can be found as 

iMfB-iM,) 

Br'M,) ■ 

(60) 


N 


p(A|At,B0=Tr(Ar'A) 


p Tr (a 


^^Tr(A,- 


Lemma 8. Under Assumption 1, for any At,Bt 0, the 
matrix 


M{At,Bt) 


N 



i=l 


7>(Ar'MfB-iM,) 


is nonsingular. 


Proof: At (At, Bt) (ignoring a constant term and up to a 
scale factor of q) the function L (A, Bt) can be upperbounded 
by 


~g (A|At, Bt) = logdet (A) + Tr (A'^M (At, Bt)) . (61) 


If M(At,Bt) is singular, we can eigendecompose 
M(At,Bt) as M(At,Bt) = Udiag (Ai,..., Ap) 
with Ai = 0, and set A~^ = Udiag (cti, ..., ap) U^. 

Letting ai ^ 0 would lead to p(A|At,Bt) unbounded 
below, which implies L (A, Bt) is also unbounded below and 
contradicts Assumption 1. ■ 

An immediate implication of Lemma0is that g (A|At, Bt) 
is strictly convex on and has a unique closed-form 

minimizer given by 

A,,,=Ar{Apl‘MAP'^y'AU, (62) 


where M = — 

N ^ 


N 




.=1 Tr (At- 


'MfB-i 


M,; 


Symmetrically, we have the the update for B given by 


Bt+i — Bt 


1/2 


B, 


- 1/2 


MB 


-1/2^/2^172 


) 


(63) 


where M = — 


N 


M,Ar+\Mf 


/V^Tr(Aj;\MfBt"iM, 


Proposition 9. Under Assumption 1, every limit point of the 
pair generated by Algorithm \7\ is a stationary point of the 
problem f l55l ). 


Proof: Application of Theorem 2 (a) in Il20ll . ■ 

Compared to Algorithm|6] which is a double loop algorithm. 
Algorithm [T] only performs a single loop iteration. 

Note that with the surrogate function of the form (IfiOl l. we 
can easily impose additional convex structures on A and B, 
and the update is found by solving the convex problem: 

At+i = arg min g (A|At, Bt), 

, (64) 

Bt+i = argminp(B|At-|-i,Bt), 

with A and B being the convex structural constraint sets. 


Algorithm 7 Robust covariance estimation under the Kro- 
necker structure (Block Majorization Minimization) 

1 : Initialize Aq and Bq to be arbitrary positive definite 
matrices of size p x p and q x q, respectively. 

2: repeat 

3: Update A with (l62l i. 

4: Update B with dfi^ . 

5: t i — f -f- 1. 

6 : until Some convergence criterion is met. 


VI. Numerical Results 


In this section, we present numerical results that demon¬ 
strate the effect of imposing structure on the covariance 
estimator on reducing estimation error, and provide a com¬ 
parison of the proposed estimator with some state-of-the-art 
estimators. The estimation error is evaluated by the normalized 
mean-square error, namely 


NMSE 


(r) 


E 


R — Rfi 


IRq 


If 


(65) 


where all of the matrices are normalized by their trace. The 
expected value is approximated by 100 Monte Carlo simula¬ 
tions. In the following, we mainly compare the performance 
of four estimators, namely, the SCM, unconstrained Tyler’s 
estimator (fixed-point equation of (|2]i), COCA (solution to 
(|9]l), and the proposed structure constrained Tyler’s estimator. 
The samples in all of the simulations of this section, if not 
otherwise specified, are i.i.d. following ~ ^/TU, where 
r ^ ^nd u ^ A/(0,Ro). The dimension K is set to be 
15. 


A. Toeplitz Structure 

In this simulation, Rq is set to be a Toeplitz matrix. The 
parameter Rq is set to be R (/3), whose ij-th entry is of the 
form 

(R(/5)).,=^'*-^'- (66) 

Fig. m shows the NMSE of the estimators with (3 = 0.8 . The 
result indicates that the structure constrained Tyler’s estimator 
achieves the smallest estimation error. In addition, we see 
that although the circulant embedding algorithm (Algorithm 
Ell with L = 2K — 1 approximately solves the Toeplitz 
structure constrained problem, it achieves virtually the same 
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Banded Toeplitz (P=0.4, Bandwidth k=3) 


- + - Tyler 
-♦- COCA (SDP) 

A Constrained Tyler (Circulant Embedding) 




20 40 60 80 100 120 140 160 180 200 

Number of Samples 


Figure 1; The estimation error (NMSE) of different estimators Figure 3: The estimation error (NMSE) of different estimators 
under the Toeplitz structure of the form (|66]l. under the banded Toeplitz structure. 


■g 10 

E 


COCA (SDP) 

Constrained Tyler (Sequential SDP) ^ 
A Constrained Tyler (Circulant embedding)] 


A A 


20 40 60 


) 100 120 140 160 180 200 

Number of Samples 


COCA (SDP) 

A Constrained Tyler (Circulant Embedding) 



) 100 120 140 

Number of Samples 


Eigure 2: Average time (in seconds) consumed by COCA Figure 4: Average time (in seconds) consumed by COCA and 
and the constrained Tyler’s estimator via sequential SDP constrained Tyler’s estimator. 

(Algorithm 1) and circulant embedding (Algorithm 2). 


estimation error as imposing the Toeplitz structure and solving 
the problem via the sequential SDP algorithm (Algorithm [T]). 
However, the computational cost of circulant embedding is 
much lower than that of sequential SDP and COCA, as shown 
in the average time cost plotted in Pig. |2] 

B. Banded Toeplitz Structure 

Next we investigate the case that Rq is a fc-banded Toeplitz 
matrix Bk (Ro), where Bk (Ro) defines a matrix with the ij- 
th entry equals to that of Rq if \i- j\ < k, and equals zero 
otherwise. Rq = R(0.4) and the bandwidth k is chosen to 
be 3. The NMSE is plotted in Pig. [3 where the constrained 
Tyler’s estimator achieves the smallest estimation error. Pig. 0] 
plots the average time consumed by COCA and the constrained 
Tyler’s estimator. As the number of semidefinite constraints 
that COCA has is proportional to N, the time consumption 
is approximately linearly increasing in N, while the time cost 
by the algorithm for the constrained Tyler’s estimator remains 
roughly the same as N grows. When N is small, the algorithm 
for COCA runs faster than ours since the scale of the SDP 
that COCA solves is small. In the regime that N is large, the 
computational cost of COCA increases, as reflected both in 
the time and the memory required to run the algorithm. 


In the third simulation, we consider Rq being a non-banded 
Toeplitz matrix with the property that (Ro)y decays rapidly 
as I* — j\ increases. We investigate the cases of Rq = R (0.4) 
(fast decay) and Rq = R (0.8) (slow decay) and impose 
a banded Toeplitz structure on the Tyler’s estimator with a 
varying bandwidth k to regularize the estimator. Pig. |3 shows 
that the smallest error is obtained when A: = 3 in the /? = 0.4 
case, and when A: = 13 in the /3 = 0.8 case. In either case, 
with the right choice of bandwidth k, the regularized estimator 
outperforms the unhanded one when the number of samples is 
relatively small compared to the dimension of the covariance 
matrix to be estimated. 

C. Direction of Arrival Estimation 

In this subsection, we examine the robustness of the pro¬ 
posed estimator in the context of the direction of arrival 
estimation problem with the following signal model: 

X (t) = As (t) + n (t) , 

where A = [a (0i),..., a (0^)] is the steering matrix and n 
is zero mean additive noise. We study the simple case of an 
ideal uniform linear array (ULA) with half-wavelength inter- 
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Regularization by banding (P=0.4) Snapshot N=20, <j=0.1 




Regularization by banding (P=0.8) 



Figure 5: NMSE of the regularized Tyler’s estimator by 
imposing the banded Toeplitz structure of different bandwidth 
k when Ro = R(0.4) and Rq = R(0.8). 


element spacing, where 


a (9) 




g-i'n-(-R'-l) sin(e) 


Assuming that the signal s (t) is a wide-sense stationary 
random process with zero mean, the covariance of x (t) is 

R = ACov (s) A'^ + Cov (n). 


Figure 6: Arrival angle estimated by MUSIC with different 
covariance estimators. 


stepsize of 5°. Fig. |6] shows the estimated arrival direction us¬ 
ing different estimators with the number of snapshots N = 20, 
and only the constrained Tyler’s estimator correctly recovers 
all of the arriving angles. 

Fig. |7] shows the performance of different estimators in 
terms of NMSE and the estimation error of noise subspace 
evaluated by 


- E,E^^ 


(67) 


with N varying from 20 to 200, where Ec denotes the noise 
subspace and Ec denotes its estimate. Eig. 7 (a) reveals that 
the constrained Tyler’s estimator achieves the smallest NMSE 
when N is small, while COCA performs better when N is 
large. However, Eig. 7 (b) indicates that the constrained Tyler’s 
estimator can estimate the noise subspace more accurately for 
all values of N, which is beneficial for algorithms that are 
based on Ec such as MUSIC. 

The average time cost by COCA and the constrained Tyler’s 
estimator is plotted in Eig. 0 It can be seen that the proposed 
method is much faster than COCA. In addition, unlike COCA, 
the consumed time of our algorithm is not sensitive to the 
number of samples N. 


Eurther assume that the signals arriving from different direc¬ 
tions are uncorrelated and that the noise is spatially white, 
i.e., Cov(s) = diag (pi,... ,pl) and Cov (n) = cr^I, the 
covariance model simplifies to be 

L 

R = (6»j) a {9j)^ + a^l. 

j=i 

We assume that the number of signals L is known in prior. 

In our simulation, 5 random signals are assumed arriving 
from directions —10°, 10°, 15°, 35°, 40° with equal power 
p = 1 and the noise power is set to be = 0.1. The received 
signal is assumed to be elliptically distributed. The number of 
sensors is K = 15. 

We first estimate R and then apply the MUSIC algorithm to 
estimate the arriving angles. The performance of SCM, Tyler’s 
estimator, COCA and the constrained Tyler’s estimator are 
compared. Eor the latter two estimators, A is constructed with 
the 9i’s uniformly located on the interval [—7r/2,7r/2] with a 


D. Spiked Covariance Structure 

We construct the true covariance Rq by the following 
model: 

L 

^0 = '^Po&,aj^ + (T% 

where the a^’s are randomly generated orthonormal basis and 
the pj’s are randomly generated corresponding eigenvectors 
uniformly distributed in [0.01,1]. is set to be 0.01. The 
number of spikes L = 10 is assumed to be known in prior. 
The matrix dimension is fixed to be iC = 100, and the number 
of samples is varied from N = 105 to A = 150. As COCA 
applies only for convex structural set and cannot be used here, 
we replace it by the projected Tyler’s estimator, which is a 
two step procedure that first obtains the Tyler’s estimator and 
then performs projection according to ( l53l l. Eig. |9] shows that 
imposing the spiked structure helps in reducing the NMSE and 
subspace estimation error measured by dCTl) . 
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(a) NMSE 



(b) Subspace error 

Figure 7; The estimation error of different estimators under 
the DOA structure: (a) NMSE, (b) estimation error of the noise 
subspace given by different estimators evaluated by ( l67b . 
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Figure 9: The estimation error of different estimators under the 
spiked covariance structure: (a) NMSE, (b) estimation error of 
the noise subspace given by different estimators evaluated by 

(Ell. 
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Figure 8: Average time (in seconds) consumed per data set by 
COCA and constrained Tyler’s estimator. 


E. Kronecker Structure 

The parameters are set to be Aq = I , Bq = R(0.8), 
p = 10, g = 8, in the simulations. We first plot the 
convergence curve of Algorithms |6] and |7] with the number 
of samples A = 4 in Fig. (TO] The two algorithms converges 
in roughly the same number of iterations, and the objective 
value corresponds to Algorithm 0 (block MM) decreases more 
smoothly than Algorithm |6] (Gauss-Seidel), as the latter is 
a double loop algorithm while the former is a single loop 
algorithm. 

Fig. [TT] plots the NMSE of Tyler’s estimator with a Kro¬ 
necker constraint on R and that with both a Kronecker 
constraint on R and a Toeplitz constraint on B. We can see that 
further imposing a Toeplitz structure on B helps in reducing 
the estimation error. 


VII. Conclusion 

In this paper, we have discussed the problem of robustly 
estimating the covariance matrix with a prior structure in¬ 
formation. The problem has been formulated as minimizing 
the negative log-likelihood function of the angular central 
Gaussian distribution subject to the prior structural constraint. 
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Figure 10: Convergence Comparison of Algorithm |6] and 0 
under the Kronecker structure. 



Figure 11: NMSE of Tyler’s estimator with a Kronecker 
structural constraint versus that with both a Kronecker and 
a Toeplitz structural constraint. 


For the general convex constraint, we have proposed a sequen¬ 
tial convex programming algorithm based on the majorization 
minimization framework. The algorithm has been particular¬ 
ized with higher computational efficiency for several specific 
structures that are widely considered in the signal processing 
community. The spiked covariance model and the Kronecker 
structure, although belonging to the non-convex constraint, are 
also discussed and shown to be computationally tractable. The 
proposed estimator has been shown outperform the state-of- 
the-art methods in the numerical section. 
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